14 research outputs found

Measuring the functional sequence complexity of proteins

Author: A Gammerman
AD Ellington
AKC Wong
AKC Wong
C Shannon
C Tuerk
David KY Chiu
David L Abel
DKY Chiu
DKY Chiu
DKY Chiu
DKY Chiu
DL Abel
DL Abel
DL Abel
DL Robertson
G Ertem
G Steinman
H Kobayashi
H Liao
HP Yockey
J Griesemer
Jack T Trevors
JF Chaparro-Riggers
JW Szostak
Kirk K Durston
KK Durston
L Gao
LM Rocha
LM Rocha
M Barbieri
M Oti
M Ronshaugen
MB Gerstein
O Weiss
PD Karp
R Backofen
S Oyama
WJL Cook
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Abel and Trevors have delineated three aspects of sequence complexity observed in biosequences such as proteins: Random Sequence Complexity (RSC), Ordered Sequence Complexity (OSC) and Functional Sequence Complexity (FSC). In this paper, we provide a method to measure functional sequence complexity. Methods and Results: We have extended Shannon uncertainty by incorporating the data variable with a functionality variable. The resulting measured unit, which we call the Functional bit (Fit), is calculated from the sequence data jointly with the defined functionality variable. To demonstrate the relevance to functional bioinformatics, a method to measure functional sequence complexity was developed and applied to 35 protein families. Considerations were made in determining how the measure can be used to correlate functionality when relating to the whole molecule and sub-molecule. In the experiment, we show that when the proposed measure is applied to the aligned protein sequences of ubiquitin, 6 of the 7 highest value sites correlate with the binding domain. Conclusion: For future extensions, measures of functional bioinformatics may provide a means to evaluate potential evolving pathways from effects such as mutations, as well as analyzing the internal structural and functional relationships within the 3-D structure of proteins.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints

Author: AV Uzilov
B Gulko
B Knudsen
B Knudsen
B Morgenstern
D Sankoff
DH Mathews
DH Mathews
DH Mathews
DKY Chiu
DS Fields
E Rivas
G Storz
I Holmes
I Holmes
I Holmes
IL Hofacker
IL Hofacker
IL Hofacker
J Gorodkin
J Gorodkin
J Gorodkin
J Reeder
J Wuyts
J Wuyts
JE Hopcroft
JE Tabaska
JH Havgaard
M Zuker
M Zuker
MS Waterman
NR Pace
O Perriquet
PP Gardner
R Durbin
R Giegerich
R Green
R Lück
R Nussinov
RD Dowell
RD Dowell
Robin D Dowell
RR Gutell
RR Gutell
RR Gutell
S Batzoglou
S Griffiths-Jones
Sean R Eddy
SR Eddy
SV Muse
V Juan
VR Akmaev
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding, and development of practical heuristics for dealing with the computational complexity of the algorithm. RESULTS: We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. CONCLUSION: Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm – this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN – have comparable overall performance with different strengths and weaknesses.
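The pin-selection step mentioned in the abstract can be illustrated with a small sketch. It assumes the posterior match probabilities are already available as a dictionary and that pins are the high-confidence pairs that stay mutually co-linear; the 0.95 cutoff, the greedy order and the name `select_pins` are illustrative assumptions, not the procedure used by Consan.

```python
def select_pins(posteriors, threshold=0.95):
    """Pick confidently aligned position pairs ("pins").

    posteriors: dict mapping (i, j) to the posterior probability that
    position i of the first sequence aligns to position j of the second.
    Pairs are considered from most to least probable, and a pair is kept
    only if it is co-linear with every pin accepted so far.
    """
    candidates = sorted(
        ((p, i, j) for (i, j), p in posteriors.items() if p >= threshold),
        reverse=True,
    )
    pins = []
    for _, i, j in candidates:
        if all((i < a and j < b) or (i > a and j > b) for a, b in pins):
            pins.append((i, j))
    return sorted(pins)


# Hypothetical posterior probabilities for a pair of short sequences.
toy_posteriors = {(0, 0): 0.99, (1, 2): 0.97, (2, 1): 0.96, (3, 3): 0.98, (4, 4): 0.50}
pins = select_pins(toy_posteriors)
```

Fixing such pins lets a dynamic programming algorithm skip every cell that would contradict them, which is how the runtime and memory savings described in the abstract arise.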

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

Computing Highly Correlated Positions Using Mutual Information and Graph Theory for G Protein-Coupled Receptors

Author: A Pagano
AA Ivanov
AK Ramani
AR Ortiz
B Galitsky
BTM Korber
C Goh
C Hemmerich
C Yeang
Carson C. Chow
CD Strader
CE Shannon
CJ Harris
CS Sum
D Altschuh
DD Pollock
DD Pollock
DKY Chiu
DM Rosenbaum
E Neher
F Horn
F Knoflach
F Pazos
F Pazos
FY Carroll
G Casari
G Kleinau
G Kleinau
G Suel
G Swaminath
GB Gloor
H Herzel
H Jaschke
I Halperin
I Kass
IG Tikhonova
IN Shindyalov
J Dutheil
J Kim
J Thomas
JA Ballesteros
JA Ballesteros
JA Capra
JE Donald
JE Donald
JE Donald
JS Surgand
JW Kelly
JX Hu
K Palczewski
K Ray
K Ray
K Sjolander
K Ye
K Ye
KD Pruitt
KL Pierce
KY Yip
L Lewyn
L Oliveira
L Oliveira
L Oliveira
L Pritchard
LA Mirny
LC Martin
LH Heitman
M Raviscioni
M Scarselli
M Socolich
MA Hanson
Matthieu Louis
ME Olah
MJ Buck
ML Lopez-Rodriguez
MS Roulston
MW Dimmic
ND Clarke
NG Hoffman
O Lichtarge
O Lichtarge
O Noivirt
OF Lange
OV Kalinina
PJ Kundrotas
PR Gouldson
R Banerjee
R Brun
R Fredriksson
R Jothi
R Steuer
RI Dima
RM Williamson
RR Gutell
S Chakrabarti
S Costanzi
S Costanzi
S Costanzi
S Costanzi
S Govindarajan
S Litschig
S Madabushi
S Moore
S Moro
S Ohno
S Takeda
Sarosh N. Fatakia
SB Nagl
SB Nagl
SD Dunn
SGF Rasmussen
SJ Fleishman
SS Hannenhalli
Stefano Costanzi
SW Lockless
T Klabunde
T Klabunde
T Sato
T Warne
TD Schneider
TM Cover
V Batageli
V Cherezov
VP Jaakola
WP Russ
WR Atchley
WR Atchley
WR Taylor
Y Liu
Y Qi
Publication venue: Public Library of Science
Publication date: 05/03/2009

G protein-coupled receptors (GPCRs) are a superfamily of seven transmembrane-spanning proteins involved in a wide array of physiological functions and are the most common targets of pharmaceuticals. This study aims to identify a cohort or clique of positions that share high mutual information. Using a multiple sequence alignment of the transmembrane (TM) domains, we calculated the mutual information between all inter-TM pairs of aligned positions and ranked the pairs by mutual information. A mutual information graph was constructed with vertices corresponding to TM positions, and an edge was drawn between two vertices if their mutual information exceeded a threshold of statistical significance. Positions with high degree (i.e. those with significant mutual information with a large number of other positions) were found to line a well-defined inter-TM ligand binding cavity for class A as well as class C GPCRs. Although the natural ligands of class C receptors bind to their extracellular N-terminal domains, the possibility of modulating their activity through ligands that bind to their helical bundle has been reported. Such positions were not found for class B GPCRs, in agreement with the observation that there are no known ligands that bind within their TM helical bundle. All identified key positions formed a clique within the MI graph of interest. For a subset of class A receptors we also considered the alignment of a portion of the second extracellular loop, and found that the two positions adjacent to the conserved Cys that bridges the loop with TM3 qualified as key positions. Our algorithm may be useful for localizing topologically conserved regions in other protein families.
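The graph construction described in the abstract can be sketched in a few lines. This is a simplified illustration: it compares every pair of columns rather than only inter-TM pairs, and it uses a fixed cutoff (here 0.5 bits, an arbitrary assumption) in place of the statistical significance threshold used in the study. The function names and toy alignment are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two aligned columns."""
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum(
        (c / n) * math.log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
        for (x, y), c in pab.items()
    )


def mi_graph_degrees(alignment, threshold):
    """Degree of each column in the graph whose edges join column pairs with MI above threshold."""
    columns = list(zip(*alignment))
    degree = [0] * len(columns)
    for i, j in combinations(range(len(columns)), 2):
        if mutual_information(columns[i], columns[j]) > threshold:
            degree[i] += 1
            degree[j] += 1
    return degree


# Hypothetical toy alignment: columns 0 and 1 covary perfectly.
toy_alignment = ["AGC", "AGC", "TCC", "TCG"]
degrees = mi_graph_degrees(toy_alignment, threshold=0.5)
```

Positions with the largest degree would then be the candidates for "key positions" in the sense of the abstract.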

Public Library of Science (PLOS)

Crossref

PubMed Central

Incorporating phylogenetic-based covarying mutations into RNAalifold for RNA consensus structure prediction

Author: A Esquela-Kerscher
AO Harmanci
B Gulko
B Knudsen
B Knudsen
C Workman
CB Do
CM Croce
Consortium The ENCODE Project
CR Woese
D Sankoff
DKY Chiu
DL Swofford
E Rivas
F Xia
IL Hofacker
IL Hofacker
J Felsenstein
JA Jaeger
JH Havgaard
JP Huelsenbeck
JS Mattick
JS Pedersen
L He
M Mandal
M Zuker
M Zuker
MA Larkin
MS Nicoloso
MS Waterman
Ping Ge
PP Gardner
PP Gardner
R Lorenz
R Nussinov
RD Dowell
RJ Klein
RR Gutell
RR Gutell
RR Sokal
S Washietl
S Will
SE Seemann
SE Seemann
SH Bernhart
Shaojie Zhang
SR Eddy
SR Eddy
The FANTOM Consortium
TR Mercer
WM Fitch
Y Sakakibara
Z Yao
Publication venue: Springer Science and Business Media LLC

Crossref

Vinylsulfonates in the carbene cyclization cycloaddition cascade reaction

Author: Chiu P
Chu JCK
Wong DKY
Publication venue: The University of British Columbia.
Publication date: 01/01/2009

Rising Stars of Research (RSR) 2009 - National Undergraduate Science and Engineering Research Poster Competition, Vancouver, B.C., 19-22 August 2009

HKU Scholars Hub

An Evolutionary Clustering Algorithm for Gene Expression Microarray Data Analysis

Author: Chan KCC
Chiu DKY
Ma PCH
Yao Xin

Clustering is concerned with the discovery of interesting groupings of records in a database. Many algorithms have been developed to tackle clustering problems in a variety of application domains. In particular, some of them have been used in bioinformatics research to uncover inherent clusters in gene expression microarray data. In this paper, we show how some popular clustering algorithms have been used for this purpose. Based on experiments using simulated and real data, we also show that the performance of these algorithms can be further improved. For more effective clustering of gene expression microarray data, which is typically characterized by a lot of noise, we propose a novel evolutionary algorithm called evolutionary clustering (EvoCluster). EvoCluster encodes an entire cluster grouping in a chromosome so that each gene in the chromosome encodes one cluster. Based on this encoding scheme, it makes use of a set of reproduction operators to facilitate the exchange of grouping information between chromosomes. The fitness function that EvoCluster adopts is able to differentiate how relevant a feature value is in determining a particular cluster grouping. As such, instead of just local pairwise distances, it also takes into consideration how clusters are arranged globally. Unlike many popular clustering algorithms, EvoCluster does not require the number of clusters to be decided in advance. Also, patterns hidden in each cluster can be explicitly revealed and presented for easy interpretation even by casual users. For performance evaluation, we have tested EvoCluster using both simulated and real data. Experimental results show that it can be very effective and robust even in the presence of noise and missing values.
Also, when correlating the gene expression microarray data with DNA sequences, we were able to uncover significant biological binding sites (both previously known and unknown) in each cluster discovered by EvoCluster.

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

University of Birmingham Research Portal

The rhodium-catalyzed carbene cyclization cycloaddition cascade reaction of vinylsulfonates

Author: Chiu P
Chu JCK
Jäger A
Lam SK
Liu LL
Merten S
Metz P
Shi B
Wong DKY
Wong WT
Publication venue: Wiley
Publication date: 01/01/2009

Vinylsulfonates have proved to be excellent dipolarophiles for carbonyl ylides derived from diazoketones in rhodium-catalyzed intramolecular cycloadditions. Polyfunctional substrates, such as 8 and (+)-15, were readily available from hydroxy esters, e.g. 1 and the cyclopenta-1,3-dione 10, respectively, and the resulting polycyclic sultones were formed under mild reaction conditions in high yields with very good diastereoselectivities. A ruthenium-catalyzed asymmetric transfer hydrogenation was found to desymmetrize the meso-cyclopenta-1,3-dione 12 efficiently. © 2009 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

HKU Scholars Hub

Report of Doctors' Fee Survey 1994

Author: Chiu JSP
Fang D
Fung WYL
Ko WM
Lam TH
Lee DKY
Lee JMH
Lee KH
Leong CH
Li DKT
So KM
Publication venue: Hong Kong Medical Association.
Publication date: 01/01/1994

HKU Scholars Hub

Information theory and the ethylene genetic network

Author: Arbib M
Chiu DKY
Gatlin LL
Grosse I
Helms V
Jones DS
José Díaz
José S. González-García
Kauffman SA
Khinchin AI
MacKay DJC
Quastler H
Taiz L
Thom R
Watson JD
Weaver W
Yockey H
Publication venue: Landes Bioscience

The original aim of Information Theory (IT) was to solve a purely technical problem: to increase the performance of communication systems, which are constantly affected by interference that diminishes the quality of the transmitted information. That is, the theory deals only with the problem of transmitting the symbols constituting a message with maximal precision. In Shannon's theory messages are characterized only by their probabilities, regardless of their value or meaning. As for its present-day status, it is generally acknowledged that Information Theory has solid mathematical foundations and fruitful, strong links with Physics in both theoretical and experimental areas. However, many applications of Information Theory to Biology are limited to using it as a technical tool to analyze biopolymers, such as DNA, RNA or protein sequences. The main point of discussion about the applicability of IT to explain the information flow in biological systems is that in a classic communication channel, the symbols that make up the coded message are transmitted one by one, independently, through a noisy communication channel, and noise can alter each of the symbols, distorting the message; in contrast, in a genetic communication channel the coded messages are not transmitted in the form of symbols but by signaling cascades. Consequently, the information flow from the emitter to the effector is due to a series of coupled physicochemical processes that must ensure the accurate transmission of the message. In this review we discuss a novel proposal to overcome this difficulty, which consists of modeling gene expression with a stochastic approach that allows Shannon entropy (H) to be directly used to measure the amount of uncertainty that the genetic machinery has in relation to the correct decoding of a message transmitted into the nucleus by a signaling pathway.
From the value of H we can define a function I that measures the amount of information content in the input message that the cell's genetic machinery is processing during a given time interval. Furthermore, combining Information Theory with the frequency response analysis of dynamical systems, we can examine the cell's genetic response to input signals with varying frequencies, amplitude and form, in order to determine if the cell can distinguish between different regimes of information flow from the environment. In the particular case of the ethylene signaling pathway, the amount of information managed by the root cell of Arabidopsis can be correlated with the frequency of the input signal. The ethylene signaling pathway cuts off very low and very high frequencies, allowing a window of frequency response in which the nucleus reads the incoming message as a varying input. Outside of this window the nucleus reads the input message as an approximately non-varying one. This frequency response analysis is also useful to estimate the rate of information transfer during the transport of each new ERF1 molecule into the nucleus. Additionally, application of Information Theory to analysis of the flow of information in the ethylene signaling pathway provides a deeper insight into the way in which the transition between auxin and ethylene hormonal activity occurs during a circadian cycle. An ambitious goal for the future would be to use Information Theory as a theoretical foundation for a suitable model of the information flow that runs at each level and through all levels of biological organization.
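As a minimal numerical illustration of the quantities named in the abstract, the sketch below computes Shannon entropy H for a discrete distribution and an information measure I. The abstract does not give the definition of I, so the form used here, the reduction from the maximum possible entropy, is an assumption chosen only for illustration, as are the function names and example distributions.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy H in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def information(probabilities):
    """Assumed form of I: maximum entropy for this many outcomes minus the observed H."""
    h_max = math.log2(len(probabilities))
    return h_max - shannon_entropy(probabilities)


# Hypothetical distributions over four decoding outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]  # maximal uncertainty
peaked = [0.97, 0.01, 0.01, 0.01]   # message decoded almost unambiguously
```

Under this assumed definition, a uniform distribution carries no information about the message, while a sharply peaked one approaches the maximum of `log2(4) = 2` bits.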

Crossref

PubMed Central

Detecting and Visualizing the Change in Classification of Customer Profiles based on Transactional Data

Author: A Subramanian
D Ruta
DKY Chiu
E Ngai
GI Webb
H Mannila
H Yan
HS Song
I Žliobaite
L Kuncheva
M Böttcher
M Böttcher
M Böttcher
M Hall
R Battiti
R Klinkenberg
R Polikar
RC Holte
X Liu
Publication venue: Springer Science and Business Media LLC

Crossref